These visualizations compare the demographics of COVID Cases to the Census percents. Bars represent the percent of cases and the dashed line represents the census percents for each category in the state of Tennessee. Bars that are under the line suggest those demographic groups are affected by COVID less than their state percent. Similarly, bars crossing the line suggest those demographic populations are affected more than their proportion of the state population.For example, it has been reported that males are more affected than females. In Tennessee, it appears that case percents are proportional to the sexes. There are more females in the state of Tennessee, and we see more cases of females positively diagnosed with COVID.
The Tennessee Coronavirus Dashboard
The sole intention of this Coronavirus dashboard is to provide a visual overview of the 2019 Novel COVID-19 as it relates to counties in Tennessee. The data is acquired from two different sources, and there are no guarantees on the accuracy of the data becaues of differences in numbers reported and reporting time.
Note: This dashboard has different graphs for small screens. For more interactive graphs, please view this website on a large screen (computer/large table).
Data
Data for “Cases across time in most populous counties” is a concatenation of the New York Times Coronavirus Data and the Tennessee State Data Center, which acquires its data from the TN Department of Health
Latest data from 04-21.
Population data acquired from the US Census.
Created by Malle Carrasco-Harris.
---
title: "COVID-19 | Tennessee"
output:
flexdashboard::flex_dashboard:
orientation: rows
vertical_layout: scroll
social: menu
source_code: embed
knit: (function(input_file, encoding) {
out_dir <- 'docs';
rmarkdown::render(input_file,
encoding=encoding,
output_file=file.path(dirname(input_file), out_dir, 'index.html'))})
---
```{r setup, include=FALSE}
library(flexdashboard)
library(readr)
library(ggplot2)
library(tidyverse)
library(dplyr)
#Acquire Data####
#Load NY Times Data###
nyt_path = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv'
counties = read_csv(url(nyt_path)) #Originally contains all counties in US.
#Separate State
tn = counties[ which(counties$state =='Tennessee'),]
tn = tn[which(tn$date < '2020-03-31'),] #The Tennessee data from the new source has data starting March 31
#Tennessee data from online
tn_state = 'https://myutk.maps.arcgis.com/sharing/rest/content/items/32b104abc5d841ca895de7f7c17fc4dc/data'
download.file(tn_state,'TN_COVID19_CountyDaily.xlsx')
tn_daily = readxl::read_excel('TN_COVID19_CountyDaily.xlsx',sheet=1) %>%
filter(DATE > '2020-03-30') %>%
select(DATE, COUNTY, TEST_POS, TEST_NEG, DEATHS_TOT) %>%
filter(COUNTY != 'Balance')
names(tn_daily) = c('Date', 'County', 'Positive', 'Negative', 'Death')
tn_daily$County = ifelse(tn_daily$County == 'Non-Tennessee Resident',
"Out of TN",
tn_daily$County)
tn_daily$County =ifelse(tn_daily$County == 'Dekalb',
'DeKalb',
tn_daily$County)
tn_daily$County =ifelse(tn_daily$County == 'VanBuren',
'Van Buren',
tn_daily$County)
tn_daily$County = sub('ff', "ff", tn_daily$County) #Coffee and Jefferson have some weird issues.
#Check counties
#levels(as.factor(tn_daily$County))
tn_daily$County = as.factor(tn_daily$County)
#tn_daily$Date = as.Date(tn_daily$Date, format = '%m/%d/%y')
#Merge NYT and Tn Daily dataframes####
tn_daily2 = tn_daily[,c('Date','County', 'Positive', 'Death')]
names(tn_daily2) = c('date','county', 'cases', 'deaths')
tn_daily2 = tn_daily2[!(tn_daily2$county =='Out of TN' | tn_daily2$county =='Pending'),]
tn_daily2 = tibble::add_column(tn_daily2, state = 'Tennessee', .after='county')
fips_daily =tn %>% group_by(county, fips) %>% tally()
tn_daily2 = left_join(tn_daily2, fips_daily[,1:2], by ='county')
##Row bind tn_daily (TN Health Dept) with tn
tn = rbind(tn, tn_daily2) #Rbind will automatically put the correct columns together.
#Add population ####
#Get Population for counties in Tennessee
uscensus = 'https://raw.githubusercontent.com/mfcarrasco/COVID-TN-Counties/master/county_pop_2019.csv'
tn_pop = read_csv(url(uscensus))
tn_pop = tn_pop[ which(tn_pop$State =='Tennessee'),]
tn_pop = tn_pop[-1,c(2:3)]
tn_pop$County = gsub(' County', '', tn_pop$County)
tn_pop$Population = as.numeric(tn_pop$Population)
tn_pop = tn_pop[, c('County', 'Population')]
names(tn_pop) = c('county', 'population')
#tn_pop[order(-tn_pop$population),]
##Combine tn (NYT) dataframe with Population
tn = left_join(tn, tn_pop, by='county')
tn$county = as.factor(tn$county)
#Calculate per million
tn['cases_per_million'] = (tn$cases/tn$population)*10^6
#Keep most recent for tn_daily
tn_daily = tn_daily %>% group_by(County) %>% top_n(1, Date)
#Clean the global environment###
rm(list=ls()[!ls() %in% c('tn', 'tn_daily')])
#Value Box Calculations
tn_ext = readxl::read_excel('TN_COVID19_COUNTYDaily.xlsx',sheet=1) %>%
top_n(1,DATE) %>%
select(DATE:RATE_CHG_1DAY,RECOV_TOT:ACTIVE_NEW) %>%
filter(COUNTY != 'Balance')
tn_ext$COUNTY = ifelse(tn_ext$COUNTY == 'Non-Tennessee Resident',
"Out of TN",
tn_ext$COUNTY)
tn_ext$COUNTY =ifelse(tn_ext$COUNTY == 'Dekalb',
'DeKalb',
tn_ext$COUNTY)
tn_ext$COUNTY =ifelse(tn_ext$COUNTY == 'VanBuren',
'Van Buren',
tn_ext$COUNTY)
tn_ext$COUNTY = sub('ff', "ff", tn_ext$COUNTY) #Coffee and Jefferson have some weird issues.
tn_ext$COUNTY = as.factor(tn_ext$COUNTY)
#Total Cases
total_cases = sum(tn_ext$TEST_POS)
total_negative = sum(tn_ext$TEST_NEG)
total_death = sum(tn_ext$DEATHS_TOT)
total_recov = sum(tn_ext$RECOV_TOT)
active_cases = total_cases - total_death - total_recov #sum(tn_ext$ACTIVE_TOT)
total_tests = total_cases + total_negative
```
Data Visualizations by County
=======================================
Rows {data-width = 150}
-----------
### Confirmed Cases to Date
```{r}
#Total Positive Cases
cases_per = ((total_cases/total_tests)*100) %>% round(1) %>% paste0('%')
total_cases_vb = total_cases %>% formattable::comma(digits=0) %>% paste0(' (',cases_per,')')
valueBox(value = total_cases_vb, icon='fa-user-plus', color='#002D65')
```
### Negative Tests
```{r}
#Total Negative Cases
negative_per = ((total_negative/total_tests)*100) %>% round(1) %>% paste0('%')
total_negative_vb = total_negative %>% formattable::comma(digits=0) %>% paste0(' (', negative_per,')')
valueBox(value = total_negative_vb, icon='fa-user-minus', color='#CC0000')
```
Rows {data-width = 150}
-----------
### Recovered Cases: `r total_recov %>% formattable::comma(digits=0)`
```{r}
recov_label = (total_recov %>% formattable::comma(digits=0))
recov_per = ((total_recov/(total_cases))*100) %>% round(1)
gauge(recov_per, min=0, max = 100, symbol = '%')
```
### Active Cases: `r active_cases %>% formattable::comma(digits=0)`
```{r}
active_per = ((active_cases/(total_cases))*100) %>% round(1)
gauge(active_per, min=0, max = 100, symbol = '%',
gaugeSectors(
success = c(0,25), warning = c(26,100)))
```
### Total Deaths: `r total_death %>% formattable::comma(digits=0)`
```{r}
#Total Deaths Cases
death_per = ((total_death/total_cases)*100) %>% round(1) %>% paste0('%')
gauge(death_per, min=0, max = 100, symbol = '%',
gaugeSectors(
success = c(0,5), warning = c(6,100)))
```
Column {data-width=650}
-----------------------------------------------------------------------
### Cases across time in most populous counties
```{r}
library(plotly)
tn_top =c('Shelby', 'Davidson', 'Knox', 'Hamilton', 'Rutherford', 'Williamson')
tn_top = tn[ tn$county %in% tn_top,]
t_line = tn_pop_line =ggplot(data=tn_top, aes(x=date, y=cases, color=county))+
geom_line(size=1)+
scale_x_date(expand = c(0,0), date_breaks = '2 day', date_labels = '%b %d')+
labs(x='', y='Cases')+
theme(legend.title = element_blank(), panel.background = element_blank(), axis.line.x=element_line(), axis.line.y.left = element_line(), axis.text=element_text(face='bold'),axis.text.x = element_text(angle=45, hjust=1))+
scale_color_brewer(palette = 'Spectral',direction=-1)
ggplotly(t_line)
```
### Cases across time in most populous counties {.mobile}
```{r}
tn_top =c('Shelby', 'Davidson', 'Knox', 'Hamilton', 'Rutherford', 'Williamson')
tn_top = tn[ tn$county %in% tn_top,]
ggplot(data=tn_top, aes(x=date, y=cases, color=county))+
geom_line(size=1)+
scale_x_date(expand = c(0,0), date_breaks = '1 week', date_labels = '%m-%d')+
labs(x='', y='Cases')+
theme(legend.title = element_blank(),
panel.background = element_blank(),
axis.line.x=element_line(),
axis.line.y.left = element_line(),
axis.text=element_text(face='bold'),
axis.text.x = element_text(angle=45, hjust=1),
legend.position = c(0,.9),
legend.justification = c(-0.1,.8))+
scale_color_brewer(palette = 'Spectral',direction=-1)
```
Row {data-width=400}
-------------------------
### Cases rate
```{r}
library(usmap)
library(viridis)
tn_geo =tn %>% group_by(county) %>% top_n(1,date)
tn_geo = tn_geo[!(tn_geo$county =='Unknown'),]
tn_geo$fips =fips(state = 'TN', county=tn_geo$county)
library(rjson)
url = 'https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json'
json_file <- rjson::fromJSON(file=url)
#Create map
fig <- plot_ly() %>%
add_trace(
text = paste(tn_geo$county,' County'),
hoverinfo = 'text',
type='choroplethmapbox',
geojson= json_file,
locations=tn_geo$fips,
z = tn_geo$cases_per_million,
zmin=0,
zmax = round(max(tn_geo$cases_per_million),-3),
colorscale='Viridis',
marker=list(line=list(
width=0),
opacity=0.9)) %>%
layout(mapbox=list(
style="carto-positron",
zoom =5.1,
center=list(lon= -86.7816, lat=36.1627))) %>%
colorbar(title = "Cases per million")
fig
```
### Case numbers by county
```{r}
#Trying Data table
tn_daily[,2:5] %>%
DT::datatable(rownames = FALSE,
colnames = c('County', 'Confirmed', 'Negative', 'Death'),
options = list(pageLength = 10))
```
Column {data-width=350, data-height=400}
-----------------------------------------------------------------------
### Positive cases by counties with more than 20 cases
```{r}
tn_cases = tn_daily[which(tn_daily$Positive >20 &
tn_daily$County != 'Pending' &
tn_daily$County != 'Out of TN'), c('County', 'Positive','Negative','Death')] #Remove where there are no cases
plot_ly(data=tn_cases,
x=tn_cases$Positive,
y=reorder(tn_cases$County, tn_cases$Positive),
type='bar',
orientation='h',
marker= list(color='#002D65')) %>%
layout(xaxis = list(title= 'Count',
zeroline = FALSE,
showline = F,
showticklabels = T,
showgrid = T),
yaxis = list(showgrid = FALSE,
showline = FALSE,
showticklabels = TRUE,
dtick=1,
tickfont = list(size=10)))
```
### Positive cases by county {.mobile}
```{r}
tn_cases = tn_daily[which(tn_daily$Positive > 20 &
tn_daily$County != 'Pending' &
tn_daily$County != 'Out of TN'), c('County', 'Positive','Negative','Death')] #Remove where there are no cases
ggplot(data=tn_cases,aes(x=Positive, y=reorder(County,Positive)))+
geom_col(fill='#002D65')+
ylab('')+
xlab('')+
theme(panel.background = element_blank(),
axis.line.x=element_line(),
axis.line.y.left = element_line(),
axis.text=element_text(face='bold'),
axis.ticks = element_blank())+
scale_x_continuous(expand= c(0,0))+
ggtitle("Counties with more than 20 cases")
```
### All outcomes by counties with more than 50 cases
```{r}
tn_cases = tn_daily[which(tn_daily$Positive > 50 &
tn_daily$County != 'Pending' &
tn_daily$County != 'Out of TN'), c('County', 'Positive','Negative','Death')] #Remove where there are no cases
plot_ly(data=tn_cases,
x= reorder(tn_cases$County, tn_cases$Negative),
y=tn_cases$Negative,
type='bar',
name='Negative Cases',
marker= list(color='#002D65')) %>%
add_trace(y = tn_cases$Positive,
name='Positive Cases',
marker = list(color='grey')) %>%
add_trace(y = tn_cases$Death,
name='Deaths',
marker = list(color='#CC0000')) %>%
layout(barmode = 'stack',
xaxis = list(showgrid = FALSE,
showlilnee = FALSE,
showticklabels = TRUE,
dtick=1,
tickfont =list(size=10)),
yaxis = list(title= 'Count',
zeroline = FALSE,
showline = F,
showticklabels = T,
showgrid = T),
hovermode = 'compare')
```
### All Cases by County {.mobile}
```{r}
library(plotly)
tn_cases = tn_daily[which(tn_daily$Positive > 50 &
tn_daily$County != 'Pending' &
tn_daily$County != 'Out of TN'), c('County', 'Positive','Negative','Death')]
plot_ly(data=tn_cases,
x= reorder(tn_cases$County, tn_cases$Negative),
y=tn_cases$Negative,
type='bar',
name='Negative Cases',
marker= list(color='darkblue')) %>%
add_trace(y = tn_cases$Positive,
name='Positive Cases',
marker = list(color='grey')) %>%
add_trace(y = tn_cases$Death,
name='Deaths',
marker = list(color='#CC0000')) %>%
layout(barmode = 'stack',
title='Counties with more than 50 cases',
xaxis = list(showgrid = FALSE,
showlilnee = FALSE,
showticklabels = TRUE,
dtick=1,
tickfont =list(size=10)),
yaxis = list(title= 'Count',
zeroline = FALSE,
showline = F,
showticklabels = T,
showgrid = T),
hovermode = 'compare')
```
Data Visualizatons by Demographics
==================================
Column {data-width=350, data-height=450}
---------------------------
### Confirmed Cases by Age
```{r}
#Get US Census Data
census_demo = 'https://raw.githubusercontent.com/mfcarrasco/COVID-TN-Counties/master/census_demographics.xlsx'
age_census = readxl::read_excel('census_demographics.xlsx',sheet='Age')
age_census$Percent = round(age_census$Percent, 1)
names(age_census)[3]='Census_Percent'
#Get TN Data
tn_age = 'https://myutk.maps.arcgis.com/sharing/rest/content/items/1bdfe86c38514c9c878241d5230d9a85/data'
download.file(tn_age,'TN_Age.xlsx')
tn_age = readxl::read_excel('TN_Age.xlsx',sheet=1) %>%
top_n(1,DATE) %>%
select(DATE, AGE, TOT_CASE_COUNT, DEATHS_TOT)
names(tn_age) = c('Date', 'Age_Ranges', 'Count', 'Deaths')
tn_age$Age_Ranges = as.factor(tn_age$Age_Ranges)
tn_age$Case_Percent = round((tn_age$Count/sum(tn_age$Count))*100,1)
tn_age$Death_Percent =round((tn_age$Deaths/sum(tn_age$Deaths))*100,1)
tn_age = tn_age[,c('Age_Ranges', 'Case_Percent','Death_Percent')] %>%
tidyr::gather('Percent', 'Value', -Age_Ranges)
tn_age = cbind(tn_age, age_census[,3])
tn_age$Census_Percent[c(10,20)] = NA
#Plot
fills = c('Case_Percent' = '#002D65', 'Death_Percent' = '#CC0000', 'Overall Population' = 'grey')
ggplot(tn_age,aes(x=Age_Ranges))+
geom_col(aes(y = Value, fill=Percent),position=position_dodge())+
geom_line(aes(y=Census_Percent, group = 1, color='population',), linetype = 'dashed')+
xlab('')+
ylab('Percent')+
theme(panel.background = element_blank(),
axis.line = element_line(),
axis.text = element_text(face = 'bold'),
axis.text.x = element_text(angle=30),
legend.title = element_blank(),
legend.direction='horizontal',
legend.position = c(.2,.90),
legend.box.just = 'left')+
scale_fill_manual(values= fills, labels=c('Cases', 'Deaths'))+
scale_color_manual(values = c('population' = 'black'),labels = 'Overall Population')
```
### Confirmed Cases by Sex
```{r}
#Get US Census Data
census_demo = 'https://raw.githubusercontent.com/mfcarrasco/COVID-TN-Counties/master/census_demographics.xlsx'
sex_census = readxl::read_excel('census_demographics.xlsx',sheet='Sex')
names(sex_census) = c('Sex', 'Census_Percent')
##stopped here####
tn_demo = 'https://myutk.maps.arcgis.com/sharing/rest/content/items/4ff6b762d64a4e0caa626df00a76c902/data'
download.file(tn_demo,'TN_Demographics.xlsx')
tn_demo = readxl::read_excel('TN_Demographics.xlsx',sheet=1) %>%
top_n(1,DATE) %>%
select(DATE, TYPE, DETAIL, TOT_CASE_COUNT) %>%
group_split(TYPE)
tn_sex = tn_demo[[2]] %>%
select(DETAIL, TOT_CASE_COUNT)
names(tn_sex) = c('Sex', 'Count')
tn_sex$Case_Percent = round((tn_sex$Count/sum(tn_sex$Count))*100,1)
tn_sex = dplyr::left_join(tn_sex[,c(1,3)], sex_census, 'Sex')
ggplot(data=tn_sex, aes(x=Sex))+
geom_col(aes(y=Case_Percent),fill='#002D65')+
geom_line(aes(y=Census_Percent, group=1, color='population'), linetype = 'dashed')+
xlab('')+
ylab('Percent')+
theme(panel.background = element_blank(),
axis.line = element_line(),
axis.text = element_text(face = 'bold'),
legend.title = element_blank(),
legend.direction='horizontal',
legend.position = c(.85,.90),
legend.box.just = 'left')+
scale_color_manual(values = c('population' = 'black'),labels = 'Overall Population')
```
Column {data-width=350, data-height=450}
---------------------------
### Confirmed Cases by Race
```{r}
#Get US Census Data
census_demo = 'https://raw.githubusercontent.com/mfcarrasco/COVID-TN-Counties/master/census_demographics.xlsx'
race_census = readxl::read_excel('census_demographics.xlsx',sheet='Race') %>%
select(Race = Race, Census_Percent =Percent)
race_census$Census_Percent = round(race_census$Census_Percent, 1)
#Get TN Data
tn_race = tn_demo[[3]] %>%
select(DETAIL, TOT_CASE_COUNT)
names(tn_race) = c('Race', 'Count')
tn_race$Percent = round((tn_race$Count/sum(tn_race$Count))*100,1)
tn_race = dplyr::left_join(tn_race[,c(1,3)], race_census, 'Race')
tn_race$Race = factor(tn_race$Race, levels = c('Asian', 'Black or African American', 'White', 'Other/Two or More Races', 'Pending'))
ggplot(data=tn_race, aes(x=Race))+
geom_bar(stat = 'identity',aes(y=Percent),fill='#002D65')+
geom_line(aes(y=Census_Percent, group=1, color='population'), linetype = 'dashed')+
xlab('')+
ylab('Percent')+
theme(panel.background = element_blank(),
axis.line = element_line(),
axis.text = element_text(face = 'bold'),
axis.text.x = element_text(angle=30, h = 1),
legend.title = element_blank(),
legend.direction='horizontal',
legend.position = c(.85,.90),
legend.box.just = 'left')+
scale_color_manual(values = c('population' = 'black'),labels = 'Overall Population')
```
### Confirmed Cases by Ethnicity
```{r}
#Get US Census Data
census_demo = 'https://raw.githubusercontent.com/mfcarrasco/COVID-TN-Counties/master/census_demographics.xlsx'
eth_census = readxl::read_excel('census_demographics.xlsx',sheet='Ethnicity') %>%
select(Ethnicity = Ethnicity, Census_Percent =Percent)
eth_census$Census_Percent = round(eth_census$Census_Percent, 1)
tn_eth = tn_demo[[1]] %>%
select(DETAIL, TOT_CASE_COUNT)
names(tn_eth) = c('Ethnicity', 'Count')
tn_eth$Percent = tn_eth$Count/sum(tn_eth$Count)*100
tn_eth = dplyr::left_join(tn_eth[,c(1,3)], eth_census, 'Ethnicity')
tn_eth$Ethnicity = factor(tn_eth$Ethnicity, levels = c('Hispanic or Latino','Not Hispanic or Latino', 'Pending'))
ggplot(data=tn_eth, aes(x=Ethnicity))+
geom_bar(stat = 'identity',aes(y=Percent),fill='#002D65')+
geom_line(aes(y=Census_Percent, group=1, color='population'), linetype = 'dashed')+
xlab('')+
ylab('Percent')+
theme(panel.background = element_blank(),
axis.line = element_line(),
axis.text = element_text(face = 'bold'),
axis.text.x = element_text(angle=30, h = 1),
legend.title = element_blank(),
legend.direction='horizontal',
legend.position = c(.85,.90),
legend.box.just = 'left')+
scale_color_manual(values = c('population' = 'black'),labels = 'Overall Population')
```
Row
----
These visualizations compare the demographics of COVID Cases to the Census percents. Bars represent the percent of cases and the dashed line represents the census percents for each category in the state of Tennessee. Bars that are under the line suggest those demographic groups are affected by COVID less than their state percent. Similarly, bars crossing the line suggest those demographic populations are affected more than their proportion of the state population.For example, it has been reported that males are more affected than females. In Tennessee, it appears that case percents are proportional to the sexes. There are more females in the state of Tennessee, and we see more cases of females positively diagnosed with COVID.
About
================================
**The Tennessee Coronavirus Dashboard**
The sole intention of this Coronavirus dashboard is to provide a visual overview of the 2019 Novel COVID-19 as it relates to counties in Tennessee. The data is acquired from two different sources, and there are no guarantees on the accuracy of the data becaues of differences in numbers reported and reporting time.
Note: This dashboard has different graphs for small screens. For more interactive graphs, please view this website on a large screen (computer/large table).
**Data**
Data for "Cases across time in most populous counties" is a concatenation of the [New York Times Coronavirus Data](https://github.com/nytimes/covid-19-data) and the [Tennessee State Data Center](https://myutk.maps.arcgis.com/home/group.html?id=c98fc99308dd43fb98146d3cf21fc31c&q=tags%3A%22COVID-19%22&view=list&focus=files#content), which acquires its data from the [TN Department of Health](https://www.tn.gov/health/cedep/ncov.html)
Latest data from `r max(tn$date) %>% format('%m-%d')`.
Population data acquired from the [US Census](https://data.census.gov/cedsci/table?q=Tennessee%20race%20demographics&g=0400000US47&tid=ACSDP1Y2018.DP05&hidePreview=true).
Created by [Malle Carrasco-Harris](https://www.linkedin.com/in/malle-carrasco-harris).